Protecting Genomic Privacy by a Sequence-Similarity Based Obfuscation Method
Authors
Abstract
In the post-genomic era, personal DNA sequences are produced and collected on a large scale for genetic medical diagnosis and drug discovery, which poses serious challenges to the protection of personal genomic privacy. Existing genomic privacy-protection methods are either time-consuming or inaccurate. To address these problems, this paper proposes a sequence-similarity-based obfuscation method, IterMegaBLAST, for fast and reliable protection of personal genomic privacy. Specifically, given a sequence selected at random from a dataset of DNA sequences, we first use MegaBLAST to find its most similar sequence in the dataset. The two aligned sequences form a cluster, for which an obfuscated sequence is generated via a DNA generalization lattice scheme. These steps are repeated until every sequence in the dataset has been clustered and its obfuscated sequence generated. Experimental results on two benchmark datasets demonstrate that, at the same degree of anonymity, IterMegaBLAST significantly outperforms existing state-of-the-art approaches in both utility accuracy and time complexity.
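The iterative pair-and-generalize loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the sequences are already aligned and of equal length, it replaces MegaBLAST with a simple per-position identity count, and it uses the IUPAC ambiguity codes as a stand-in for the DNA generalization lattice. The function names (`identity`, `generalize`, `iter_cluster`) and the handling of a leftover sequence when the dataset has an odd size are hypothetical choices made for this example.

```python
import random

# IUPAC ambiguity codes: each code maps to the set of bases it stands for.
# Assumption: used here as a simple stand-in for a DNA generalization lattice.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def identity(a, b):
    """Count matching positions (a crude stand-in for a MegaBLAST score)."""
    return sum(x == y for x, y in zip(a, b))


def generalize(seqs):
    """Column-wise least general IUPAC code covering all aligned sequences."""
    out = []
    for column in zip(*seqs):
        bases = frozenset().union(*(IUPAC[c] for c in column))
        out.append(BASES_TO_CODE[bases])
    return "".join(out)


def iter_cluster(seqs, seed=0):
    """Pair each randomly chosen sequence with its most similar unclustered
    neighbour and return one obfuscated sequence per input sequence."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences to form a cluster")
    rng = random.Random(seed)
    remaining = list(range(len(seqs)))
    result = [None] * len(seqs)
    cluster = []
    while len(remaining) >= 2:
        i = remaining.pop(rng.randrange(len(remaining)))
        j = max(remaining, key=lambda k: identity(seqs[i], seqs[k]))
        remaining.remove(j)
        cluster = [i, j]
        obfuscated = generalize([seqs[k] for k in cluster])
        for k in cluster:
            result[k] = obfuscated
    if remaining:
        # Hypothetical choice: fold the odd sequence into the last cluster.
        cluster = cluster + remaining
        obfuscated = generalize([seqs[k] for k in cluster])
        for k in cluster:
            result[k] = obfuscated
    return result
```

For example, `iter_cluster(["ACGT", "ACGA", "TTTT", "TTTC"])` pairs the first two and the last two sequences, so each member of a pair receives the same generalized sequence and cannot be told apart from its partner.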
Similar resources
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
Genomic data provides clinical researchers with vast opportunities to study various patient ailments. Yet the same data contains revealing information, some of which a patient might want to remain concealed. The question then arises: how can an entity transact in full DNA data while concealing certain sensitive pieces of information in the genome sequence, and maintain DNA data utility? As a re...
A Formal Model of Obfuscation and Negotiation for Location Privacy
Obfuscation concerns the practice of deliberately degrading the quality of information in some way, so as to protect the privacy of the individual to whom that information refers. In this paper, we argue that obfuscation is an important technique for protecting an individual’s location privacy within a pervasive computing environment. The paper sets out a formal framework within which obfuscate...
Protecting Location Privacy through Semantics-aware Obfuscation Techniques
The widespread adoption of location-based services (LBS) raises increasing concerns for the protection of personal location information. To protect location privacy the usual strategy is to obfuscate the actual position of the user with a coarse location and then forward the obfuscated location to the LBS provider. Existing techniques for location obfuscation are only based on geometric methods...
A Novel Reversible De-Identification Approach For Lossless Image Compression Based On Reversible Watermarking Mechanism Based On Obfuscation Process
Although tremendous progress has been made in the past years on watermarking for protecting information from incidental or accidental hacking, there still exists a number of problems. De-Identification is a process which can be used to ensure privacy by concealing the identity of individuals captured by video surveillance systems. One important challenge is to make the obfuscation process rever...
Privacy Games: Optimal User-Centric Data Obfuscation
Consider users who share their data (e.g., location) with an untrusted service provider to obtain a personalized (e.g., location-based) service. Data obfuscation is a prevalent user-centric approach to protecting users’ privacy in such systems: the untrusted entity only receives a noisy version of user’s data. Perturbing data before sharing it, however, comes at the price of the users’ utility ...
Journal title: CoRR
Volume: abs/1708.02629
Publication date: 2017